Layered Networks


Existence of Adversarial Examples for Random Convolutional Networks via Isoperimetric Inequalities on $SO(d)$

Daniely, Amit

arXiv.org Machine Learning

We show that adversarial examples exist for various random convolutional networks, and furthermore, that this is a relatively simple consequence of the isoperimetric inequality on the special orthogonal group $SO(d)$. This extends and simplifies a recent line of work which shows similar results for random fully connected networks.
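
The isoperimetric inequality itself is not stated in the abstract; the concentration phenomenon it yields on the special orthogonal group can be summarized by the following standard Lévy/Gromov-Milman-type statement (notation ours, not the paper's; $c$ is a generic absolute constant):

```latex
% Concentration on SO(d): for f that is L-Lipschitz with respect to the
% Hilbert--Schmidt metric and W drawn from the normalized Haar measure,
\[
  \Pr_{W \sim \mathrm{Haar}(SO(d))}\!\left( \lvert f(W) - M_f \rvert > t \right)
  \;\le\; 2 \exp\!\left( - \frac{c\, d\, t^{2}}{L^{2}} \right),
\]
% where M_f is a median of f and c > 0 is an absolute constant.
```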


Effects of Firing Synchrony on Signal Propagation in Layered Networks

Neural Information Processing Systems

Spiking neurons which integrate to threshold and fire were used to study the transmission of frequency-modulated (FM) signals through layered networks. Firing correlations between cells in the input layer were found to modulate the transmission of FM signals under certain dynamical conditions. A tonic level of activity was maintained by providing each cell with a source of Poisson-distributed synaptic input. When the average membrane depolarization produced by the synaptic input was sufficiently below threshold, the firing correlations between cells in the input layer could greatly amplify the signal present in subsequent layers. When the depolarization was sufficiently close to threshold, however, the firing synchrony between cells in the initial layers could no longer affect the propagation of FM signals.
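
As a rough illustration of the kind of cell described above (a minimal sketch, not the paper's simulation; every constant below is arbitrary), here is an integrate-to-threshold-and-fire unit driven by Poisson-distributed synaptic input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal integrate-to-threshold-and-fire cell with Poisson synaptic drive.
# All constants are illustrative, not taken from the paper.
T = 1000          # number of time steps
dt = 1.0          # ms per step
rate = 0.2        # Poisson input rate per step (tonic background activity)
w_syn = 0.05      # depolarization per synaptic event
theta = 1.0       # firing threshold
leak = 0.98       # passive decay of membrane potential per step

v = 0.0
spikes = []
for t in range(T):
    n_events = rng.poisson(rate)          # Poisson-distributed input events
    v = leak * v + w_syn * n_events       # integrate synaptic input
    if v >= theta:                        # threshold crossing -> spike
        spikes.append(t * dt)
        v = 0.0                           # reset after firing

print(f"{len(spikes)} spikes; mean rate {len(spikes) / (T * dt):.4f} per ms")
```

Raising `w_syn` or `rate` moves the mean depolarization toward threshold, which is the regime the abstract contrasts with the strongly subthreshold one.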


On the Concentration of Expectation and Approximate Inference in Layered Networks

Neural Information Processing Systems

We present an analysis of concentration-of-expectation phenomena in layered Bayesian networks that use generalized linear models as the local conditional probabilities. This framework encompasses a wide variety of probability distributions, including both discrete and continuous random variables. We utilize ideas from large deviation analysis and the delta method to devise and evaluate a class of approximate inference algorithms for layered Bayesian networks that have superior asymptotic error bounds and very fast computation time.
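
The delta method referred to above approximates the expectation of a nonlinear function of a concentrating random variable by expanding around its mean, $E[f(X)] \approx f(\mu) + \tfrac{1}{2} f''(\mu)\sigma^2$. A small numeric sanity check of that idea (a generic illustration, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_dd(x):
    s = sigmoid(x)
    return s * (1 - s) * (1 - 2 * s)   # second derivative of the sigmoid

# X plays the role of the summed input to a unit in a layered network;
# as fan-in grows, X concentrates and the expansion becomes accurate.
mu, sigma = 0.5, 0.3
x = rng.normal(mu, sigma, size=1_000_000)

mc = sigmoid(x).mean()                                 # Monte Carlo "truth"
delta = sigmoid(mu) + 0.5 * sigmoid_dd(mu) * sigma**2  # delta-method estimate
naive = sigmoid(mu)                                    # zeroth-order estimate

print(f"Monte Carlo:  {mc:.5f}")
print(f"Delta method: {delta:.5f}")
print(f"Naive f(mu):  {naive:.5f}")
```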


Automating the Design and Development of Gradient Descent Trained Expert System Networks

Straub, Jeremy

arXiv.org Artificial Intelligence

Prior work introduced a gradient descent trained expert system that conceptually combines the learning capabilities of neural networks with the understandability and defensible logic of an expert system. This system was shown to be able to learn patterns from data and to perform decision-making at levels rivaling those reported by neural network systems. The principal limitation of the approach, though, was the necessity for the manual development of a rule-fact network (which is then trained using backpropagation). This paper proposes a technique for overcoming this significant limitation relative to neural networks. Specifically, it proposes the use of rule-fact networks that are larger and denser than the application requires, which are trained, pruned, manually reviewed and then re-trained for use. Multiple types of networks are evaluated under multiple operating conditions and these results are presented and assessed. Based on these individual experimental condition assessments, the proposed technique is evaluated. The data presented show that error rates as low as 3.9% mean (1.2% median) can be obtained, demonstrating the efficacy of this technique for many applications.
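
The train-prune-review-retrain workflow described above can be sketched generically as follows. This is not the paper's rule-fact formulation; a single dense linear layer stands in for the over-provisioned network, and all data and hyperparameters are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Generic train -> prune -> retrain loop on a dense layer standing in for a
# denser-than-needed rule-fact network. Illustrative sketch only.
n_facts, n_samples = 20, 500
X = rng.normal(size=(n_samples, n_facts))            # fact values
true_w = rng.normal(size=n_facts) * (rng.random(n_facts) < 0.3)
y = X @ true_w + 0.01 * rng.normal(size=n_samples)   # target decisions

def fit(X, y, mask, steps=2000, lr=0.01):
    w = rng.normal(scale=0.1, size=X.shape[1]) * mask
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)            # squared-error gradient
        w -= lr * grad
        w *= mask                                    # keep pruned weights at 0
    return w

mask = np.ones(n_facts)
w = fit(X, y, mask)                                  # 1) train dense network
mask = (np.abs(w) > 0.05).astype(float)              # 2) prune small weights
w = fit(X, y, mask)                                  # 3) retrain pruned network
print(f"kept {int(mask.sum())}/{n_facts} connections, "
      f"MSE {np.mean((X @ w - y) ** 2):.4f}")
```

The manual-review step in the paper's pipeline has no code analogue here; it would sit between pruning and retraining.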


what_nns_learn.html

#artificialintelligence

Neural networks are famously difficult to interpret. It's hard to know what they are actually learning when we train them. Let's take a closer look and see whether we can build a good picture of what's going on inside. Just like every other supervised machine learning model, neural networks learn relationships between input variables and output variables. In fact, we can even see how they're related to the most iconic model of all, linear regression. Linear regression assumes a straight-line relationship between an input variable x and an output variable y: x is multiplied by a constant m, which also happens to be the slope of the line, and added to another constant b, which happens to be where the line crosses the y axis, giving y = mx + b. We can represent this in a picture. Our input value x is multiplied by m. Our constant b is multiplied by one. And then they are added together to get y.
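
In code, that picture is literally the smallest possible network: one input with weight m, plus a bias unit that always outputs one, weighted by b (the values of m and b below are arbitrary):

```python
import numpy as np

# Linear regression, y = m*x + b, drawn as a one-neuron network:
# input x with weight m, plus a bias unit that outputs 1, weighted by b.
m, b = 2.0, -1.0   # slope and y-intercept (arbitrary illustrative values)

def neuron(x):
    return m * x + b * 1.0   # weighted input plus weighted bias unit

xs = np.array([0.0, 1.0, 2.0])
print(neuron(xs))   # [-1.  1.  3.]
```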


Kernels and Submodels of Deep Belief Networks

Montufar, Guido F., Morton, Jason

arXiv.org Machine Learning

We study the mixtures of factorizing probability distributions represented as visible marginal distributions in stochastic layered networks. We take the perspective of kernel transitions of distributions, which gives a unified picture of distributed representations arising from Deep Belief Networks (DBN) and other networks without lateral connections. We describe combinatorial and geometric properties of the set of kernels and products of kernels realizable by DBNs as the network parameters vary. We describe explicit classes of probability distributions, including exponential families, that can be learned by DBNs. We use these submodels to bound the maximal and the expected Kullback-Leibler approximation errors of DBNs from above depending on the number of hidden layers and units that they contain.
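
For reference, the kernel-transition view summarized above corresponds to writing the visible marginal of an L-layer DBN as a composition of conditional kernels applied to the top-layer distribution (the standard DBN factorization; the notation is ours, not the paper's):

```latex
% Visible marginal of an L-layer DBN: the top two layers carry a joint
% (RBM) distribution, and each lower layer applies a conditional kernel
% to the layer above it.
\[
  p(v) \;=\; \sum_{h^{1},\dots,h^{L}}
  P\!\left(v \mid h^{1}\right)
  P\!\left(h^{1} \mid h^{2}\right) \cdots
  P\!\left(h^{L-2} \mid h^{L-1}\right)
  P\!\left(h^{L-1}, h^{L}\right)
\]
```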


Modeling Time Varying Systems Using Hidden Control Neural Architecture

Levin, Esther

Neural Information Processing Systems

This paper introduces a generalization of the layered neural network that can implement a time-varying nonlinear mapping between its observable input and output. The variation of the network's mapping is due to an additional, hidden control input, while the network parameters remain unchanged. We proposed an algorithm for finding the network parameters and the hidden control sequence from a training set of examples of observable input and output. This algorithm implements an approximate maximum likelihood estimation of parameters of an equivalent statistical model, when only the dominant control sequence is taken into account. The conceptual difference between the proposed model and the HMM is that in the HMM approach, the observable data in each of the states is modeled as though it was produced by a memoryless source, and a parametric description of this source is obtained during training, while in the proposed model the observations in each state are produced by a nonlinear dynamical system driven by noise, and both the parametric form of the dynamics and the noise are estimated. The performance of the model was illustrated for the tasks of nonlinear time-varying system modeling and continuously spoken digit recognition. The reported results show the potential of this model for providing high performance speech recognition capability. Acknowledgment: Special thanks are due to N. Merhav for numerous comments and helpful discussions.
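
A toy version of the estimation loop described above, under strong simplifying assumptions (a linear map per control value in place of the network, Gaussian noise so maximum likelihood reduces to least squares, and only the single dominant control sequence retained); this is an illustrative sketch, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy hidden-control model: y_t = W[c_t] @ x_t + noise, with c_t in {0, 1}
# unobserved. Alternate between (a) choosing the dominant control sequence
# for fixed parameters and (b) refitting parameters for fixed controls.
T, d = 400, 3
W_true = [rng.normal(size=d), rng.normal(size=d)]
c_true = (np.arange(T) // 50) % 2                 # control switches every 50 steps
X = rng.normal(size=(T, d))
y = np.array([W_true[c_true[t]] @ X[t] for t in range(T)])
y += 0.05 * rng.normal(size=T)

W = [rng.normal(size=d), rng.normal(size=d)]      # initial parameter guess
for _ in range(10):
    # (a) dominant control sequence: per-step ML choice under Gaussian noise
    err = np.stack([(y - X @ W[k]) ** 2 for k in (0, 1)])
    c = err.argmin(axis=0)
    # (b) refit each regime by least squares on its assigned time steps
    for k in (0, 1):
        idx = c == k
        if idx.sum() >= d:
            W[k], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)

# The two regimes may come back in either order (label switching).
print("residual vs. true parameters:",
      min(np.linalg.norm(W[0] - W_true[0]), np.linalg.norm(W[0] - W_true[1])))
```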


Neural Network Recognizer for Hand-Written Zip Code Digits

Denker, John S., Gardner, W. R., Graf, Hans Peter, Henderson, Donnie, Howard, R. E., Hubbard, W., Jackel, L. D., Baird, Henry S., Guyon, Isabelle

Neural Information Processing Systems

This paper describes the construction of a system that recognizes hand-printed digits, using a combination of classical techniques and neural-net methods. The system has been trained and tested on real-world data, derived from zip codes seen on actual U.S. Mail. The system rejects a small percentage of the examples as unclassifiable, and achieves a very low error rate on the remaining examples. The system compares favorably with other state-of-the-art recognizers. While some of the methods are specific to this task, it is hoped that many of the techniques will be applicable to a wide range of recognition tasks.
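
A reject option of the kind mentioned above is commonly implemented by thresholding the classifier's confidence. A minimal sketch (the rule and thresholds are illustrative, not the paper's mechanism):

```python
import numpy as np

def classify_with_reject(scores, threshold=0.7, margin=0.1):
    """Return the predicted digit, or None to reject as unclassifiable.

    scores: length-10 array of class probabilities for digits 0-9.
    Common rule (illustrative): reject when the top class is not confident
    enough, or not clearly separated from the runner-up.
    """
    order = np.argsort(scores)
    top, runner_up = scores[order[-1]], scores[order[-2]]
    if top < threshold or top - runner_up < margin:
        return None                      # reject as unclassifiable
    return int(order[-1])

confident = np.array([.01, .02, .01, .85, .02, .02, .02, .02, .02, .01])
print(classify_with_reject(confident))        # 3
print(classify_with_reject(np.full(10, 0.1))) # None (rejected)
```

Raising the threshold trades a higher rejection rate for a lower error rate on the examples that are kept, which is the trade-off the abstract describes.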